


Review for NeurIPS paper: Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? --- A Neural Tangent Kernel Perspective

Neural Information Processing Systems

Additional Feedback:

### On my overall decision
I am willing to largely upgrade my decision if the authors can provide strong evidence that is easy to check (i.e., "safety checks") to support the correctness of their propositions/theorems. Since the size m of the hidden layers becomes infinite, the set of weights tends to a fixed limiting distribution, the same for all layers. Therefore, when m goes to infinity, the time-varying component gets smoothed out. So, when L also becomes infinite, we exactly recover an unrolled, 1-layer recurrent neural network.

### Typos
- "By Representer theorem" -> "By the representer theorem"
- Fig. 2, caption: "CIFAR102" -> "CIFAR2"

### Reply to author response
Thank you for the additional plots provided in your response, which indeed nicely confirm your main theorems.


Review for NeurIPS paper: Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? --- A Neural Tangent Kernel Perspective

Neural Information Processing Systems

After a thorough discussion among the reviewers, there is a consensus that this is a good paper that warrants acceptance. There was some skepticism in the initial reviews, but the authors provided a rebuttal that addressed most of the major concerns, and the reviewers updated their reviews and scores accordingly. Hence, the paper is accepted as a poster. In my own judgment, the presentation of this paper should be improved in the camera-ready version.


Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? --- A Neural Tangent Kernel Perspective

Neural Information Processing Systems

Deep residual networks (ResNets) have demonstrated better generalization performance than deep feedforward networks (FFNets). However, the theory behind this phenomenon is still largely unknown. This work studies the problem from a neural tangent kernel perspective. Specifically, we first show that under proper conditions, as the width goes to infinity, training deep ResNets can be viewed as learning reproducing kernel functions with some kernel function. We then compare the kernel of deep ResNets with that of deep FFNets and discover that the class of functions induced by the kernel of FFNets is asymptotically not learnable, as the depth goes to infinity. In contrast, the class of functions induced by the kernel of ResNets does not exhibit such degeneracy. Our discovery partially justifies the advantages of deep ResNets over deep FFNets in generalization abilities.
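The abstract above casts infinite-width training as learning with a fixed kernel. As a rough illustration of what that kernel is, the following is a minimal numpy sketch (not the paper's construction) of the empirical neural tangent kernel K(x, x') = ⟨∇θ f(x), ∇θ f(x')⟩ for a toy one-hidden-layer network; all names and sizes here are illustrative assumptions.

```python
import numpy as np

# Toy network f(x) = v^T relu(W x) / sqrt(m); the empirical NTK is the
# inner product of parameter gradients at two inputs. At large width m,
# this kernel concentrates around a deterministic limit, which is what
# makes the "training as kernel learning" view possible.
rng = np.random.default_rng(0)
d, m = 3, 4096                      # input dimension, hidden width
W = rng.normal(size=(m, d))         # hidden-layer weights
v = rng.normal(size=m)              # output-layer weights

def grad_f(x):
    """Gradient of f(x) = v @ relu(W @ x) / sqrt(m) w.r.t. (W, v)."""
    h = W @ x
    a = np.maximum(h, 0.0)
    dv = a / np.sqrt(m)                          # df/dv
    dW = np.outer(v * (h > 0), x) / np.sqrt(m)   # df/dW (relu subgradient)
    return np.concatenate([dW.ravel(), dv])

def ntk(x1, x2):
    """Empirical neural tangent kernel value K(x1, x2)."""
    return grad_f(x1) @ grad_f(x2)

x1 = rng.normal(size=d)
x2 = rng.normal(size=d)
print(ntk(x1, x1), ntk(x1, x2))
```

The kernel is symmetric and positive on the diagonal by construction; the paper's analysis concerns how the corresponding limiting kernels of ResNets and FFNets behave as the depth grows.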


Unsupervised Shape Completion via Deep Prior in the Neural Tangent Kernel Perspective

Chu, Lei, Pan, Hao, Wang, Wenping

arXiv.org Artificial Intelligence

We present a novel approach for completing and reconstructing 3D shapes from incomplete scanned data using deep neural networks. Rather than being trained on supervised completion tasks and then applied to a test shape, the network is optimized from scratch on the single test shape, so that it fully adapts to the shape and completes the missing data using contextual guidance from the known regions. The ability of an untrained neural network to complete missing data is usually referred to as the deep prior. In this paper, we interpret the deep prior from a neural tangent kernel (NTK) perspective and show that the shape patches completed by the trained CNN are naturally similar to existing patches, as they are proximate in the kernel feature space induced by the NTK. This interpretation allows us to design more efficient network structures and learning mechanisms for the shape completion and reconstruction task. Being more aware of structural regularities than both traditional and other unsupervised learning-based reconstruction methods, our approach completes large missing regions with plausible shapes. It also complements supervised learning-based methods that use database priors: it requires no extra training data set and adapts flexibly to a particular shape instance.
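The deep-prior idea above can be sketched in one dimension: an untrained network is fit only on the known samples of a signal, and its prediction over the masked region serves as the "completion". This is a hedged toy analogue, not the paper's 3-D pipeline; it trains only the last layer of a random-feature network (a linearized, NTK-style regime), and every hyperparameter here is an illustrative assumption.

```python
import numpy as np

# 1-D deep-prior toy: complete the masked middle of sin(3x) by fitting a
# randomly initialized network on the known samples only.
rng = np.random.default_rng(1)
xs = np.linspace(-1.0, 1.0, 64)[:, None]
signal = np.sin(3.0 * xs[:, 0])
known = np.abs(xs[:, 0]) > 0.3        # mask: the middle region is missing

m = 256                               # hidden width (random tanh features)
W1 = rng.normal(size=(1, m)) * 2.0    # frozen first-layer weights
b1 = rng.normal(size=m)               # frozen biases
w2 = np.zeros(m)                      # trained last-layer weights

lr = 0.1
for _ in range(2000):
    h = np.tanh(xs @ W1 + b1)                     # features, shape (64, m)
    pred = h @ w2 / np.sqrt(m)
    err = (pred - signal) * known                 # loss only on known samples
    g2 = h.T @ err / np.sqrt(m) / known.sum()     # gradient w.r.t. w2
    w2 -= lr * g2

# Prediction on the masked region is the completion.
completion = (np.tanh(xs @ W1 + b1) @ w2 / np.sqrt(m))[~known]
```

Because only the last layer moves, training is exactly kernel (random-feature) regression, so the completion inherits the smoothness of the induced kernel, a 1-D caricature of the paper's point that completed patches are proximate to known ones in NTK feature space.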